The dataset that I chose to analyze is contains data with items that have been checked out at least 5 times, and are of the Book medium(book, audiobook, ebook) the reason for this is because, I wanted to check how the pandemic had an effect on checkout behavior, so we are going to be looking at trends on which medium got checkout the most, and the trends over time of the authors with the most checkuts. At the end since I am a harry potter fan we are going to analyze harry potters checkout numbers
The Seattle Public Library dataset provides insightful values for visitors’ preferences, such as the authors, publishers, and books that got checked out the most, as well as annual and monthly check out trends.
The most frequently checked out author is Mo Williams, with a total of 326710 checkouts.
The book item that has been checked out the most is Educated a memoir, by Tara Westover with an incredible 17,817 total checkouts
The most checked out publisher is Random House, Inc., with a total of 1,830,160 checkouts
Between the year 2005-2023, the year with the most amout of checkouts was 2019, with the total amount of 2,626,271 checkouts
January is the the month with the most amount of checkouts with the total amount of 2,537,907 checkouts
Seattle open data collected and published data from the Seattle public library, which includes the monthly count of checkouts at the library. Although this program to release data from the library only started in 2017, there was an employee at the library who was collecting this data since 2005, to display to visitors in the library what was getting checked out the most, now the data being collected become of an initiative by Barack Obama to have more open data to the public.
The dataset collects usage class (physical or digital, checkout type, material type (book, movie, etc.), check out year and month, total checkout amounts, and the title, creators, and subject of each piece, as well as the publisher and publication year. There are a total of 42 million rows in this dataset, however, this report will focus on checkouts with a book checkout type, and with books that were checked out more than ten times.
When working with this data, analysts need to make sure that they are not excluding medium and creators. They need to make it explicit that they are only working with a certain type of data, to ensure that the data being represent is a completely accurate picture. The main problem with this dataset is that since it’s very large, there can be lots of values that are missing, and since there are so many rows detecting those values is challenging, thus the best way to work with this data is to select a small portion of it and analyze that.
My motivation for the medium checkouts over time was to see how the increasing popularity of audiobooks and ebooks changed checkout behaviors at the SPL. We can see that during the pandemic, physical book checkouts plummeted, and the checkout numbers for audiobooks and ebooks increased dramatically, again suggesting that the pandemic had a tremendous impact on checkout behaviors.
The reason I decided checkout the number for harry potter books was because I enjoyed reading the first three books, and I found that since Harry potter has a huge fan base, fans could be interested on checkout the numbers on their favorite books. The data-visualizations display harry potter book checkouts that were in English